2 research outputs found
An Improved Transformer-based Model for Detecting Phishing, Spam, and Ham: A Large Language Model Approach
Phishing and spam detection is a long-standing challenge that has been the
subject of much academic research. Large Language Models (LLMs) have vast
potential to transform society and provide new and innovative approaches to
well-established challenges. Phishing and spam have cost email users around
the world money, time, and resources, and frequently serve as an entry point
for ransomware threat actors. While detection approaches exist, especially
heuristic-based ones, LLMs offer the potential to explore a new, largely
uncharted avenue for understanding and solving this challenge. LLMs have
rapidly altered the landscape for businesses, consumers, and academia, and
demonstrate transformational potential for society. Applying these new
approaches to email detection is therefore a rational next step in academic
research. In this work, we present IPSDM, a model based on fine-tuning the
BERT family of models to specifically detect phishing and spam email. We
demonstrate that our fine-tuned version, IPSDM, classifies emails more
accurately on both unbalanced and balanced datasets. This work serves as an
important first step toward employing LLMs to improve the security of our
information systems.
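The abstract reports classification quality on both unbalanced and balanced datasets; per-class precision, recall, and F1 make that distinction concrete, since overall accuracy can look strong on an unbalanced set even when minority classes (phishing, spam) are missed. A minimal stdlib sketch, with the label set and the toy predictions purely illustrative rather than taken from the paper:

```python
def per_class_metrics(y_true, y_pred, labels):
    """Compute precision, recall, and F1 for each class label."""
    metrics = {}
    for label in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        metrics[label] = {"precision": precision, "recall": recall, "f1": f1}
    return metrics

# Illustrative unbalanced sample: "ham" dominates, and the lone
# spam email is misclassified as ham.
y_true = ["ham"] * 8 + ["spam"] + ["phishing"]
y_pred = ["ham"] * 8 + ["ham"] + ["phishing"]
scores = per_class_metrics(y_true, y_pred, ["phishing", "spam", "ham"])
```

Here overall accuracy is 90%, yet spam recall is zero, which is exactly the kind of failure a balanced evaluation exposes.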
Performance Analysis of Machine Learning Algorithm on Cloud Platforms: AWS vs Azure vs GCP
The adoption of cloud technology in enterprises is accelerating and becoming ubiquitous in business and industry. By migrating on-premises servers and services into the cloud, companies can leverage several advantages such as cost optimization, high performance, and flexible system maintenance, to name a few. As data volume, variety, veracity, and velocity rise tremendously, adopting machine learning (ML) solutions on a cloud platform brings benefits across the pipeline, from ML model building through model evaluation, more efficiently and accurately. This study provides a comparative performance analysis of the three big cloud vendors: Amazon Web Services (AWS), Microsoft Azure, and Google Cloud Platform (GCP), by building regression models on each platform. For validation purposes, i.e., training and testing the models, five standard datasets from the UCI machine learning repository are employed. This work utilizes the ML services of AWS SageMaker, Azure ML Studio, and Google BigQuery for the experiments. Model evaluation criteria include the R-squared value on each platform and the error metrics (Mean Squared Error, Mean Absolute Error, Root Mean Squared Error, etc.), and the results are compared to determine the best-performing cloud provider in terms of ML service. The study concludes by presenting a comparative taxonomy of regression models across the three platforms.
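The evaluation criteria named in the abstract (R-squared plus the MSE, MAE, and RMSE error metrics) can all be computed from a model's predictions in a few lines. A minimal stdlib sketch, with the toy data purely illustrative and not drawn from the study's datasets:

```python
import math

def regression_metrics(y_true, y_pred):
    """R-squared and standard error metrics for regression predictions."""
    n = len(y_true)
    mean_y = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)             # total sum of squares
    mse = ss_res / n
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n
    return {
        "r2": 1 - ss_res / ss_tot,
        "mse": mse,
        "mae": mae,
        "rmse": math.sqrt(mse),
    }

# Toy example: near-perfect predictions on four points.
m = regression_metrics([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.0, 4.2])
```

Comparing these values across platforms on identical train/test splits is what makes a cross-vendor ranking like the one described here meaningful.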